Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles

نویسندگان

  • James Cross
  • Liang Huang
چکیده

Parsing accuracy using efficient greedy transition systems has improved dramatically in recent years thanks to neural networks. Despite striking results in dependency parsing, however, neural models have not surpassed stateof-the-art approaches in constituency parsing. To remedy this, we introduce a new shiftreduce system whose stack contains merely sentence spans, represented by a bare minimum of LSTM features. We also design the first provably optimal dynamic oracle for constituency parsing, which runs in amortized O(1) time, compared to O(n) oracles for standard dependency parsing. Training with this oracle, we achieve the best F1 scores on both English and French of any parser that does not use reranking or external data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Minimal Span-Based Neural Constituency Parser

In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. We show that this model is not only compatible with classical dynamic programming techniques, but also admits a novel greedy top-down inference algorithm based on recursive partitioning of the input. We demonstrate empirically that both prediction schemes are competitive wi...

متن کامل

Discontinuous parsing with continuous trees

We introduce a new method for incremental shift-reduce parsing of discontinuous constituency trees, based on the fact that discontinuous trees can be transformed into continuous trees by changing the order of the terminal nodes. It allows for a clean formulation of different oracles, leads to faster parsers and provides better results. Our best system achieves an F1 of 80.02 on TIGER.

متن کامل

A Dynamic Oracle for Arc-Eager Dependency Parsing

The standard training regime for transition-based dependency parsers makes use of an oracle, which predicts an optimal transition sequence for a sentence and its gold tree. We present an improved oracle for the arc-eager transition system, which provides a set of optimal transitions for every valid parser configuration, including configurations from which the gold tree is not reachable. In such...

متن کامل

Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank

Discourse parsing has long been treated as a stand-alone problem independent from constituency or dependency parsing. Most attempts at this problem are pipelined rather than end-to-end, sophisticated, and not self-contained: they assume goldstandard text segmentations (Elementary Discourse Units), and use external parsers for syntactic features. In this paper we propose the first end-to-end dis...

متن کامل

Multilingual Lexicalized Constituency Parsing with Word-Level Auxiliary Tasks

We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower level character biLSTM (Ballesteros et al., 2015; Plank et al., 2016). We model two important interfaces of constituency parsing with auxiliary tasks supervised at the word level: (i) part-of-speech (POS) and morphological...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016